-
Notifications
You must be signed in to change notification settings - Fork 254
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds correct UTF-8 encoding to Net::BER::BerIdentifiedString #242
Conversation
|
||
assert bis.valid_encoding?, "should be a valid encoding" | ||
assert_equal "UTF-8", bis.encoding.name | ||
end | ||
|
||
def test_ut8_data_in_utf8 | ||
def test_utf8_data_in_utf8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for fixing this.
@andibachmann thank you for taking the time to open this pull request and including helpful comments. I think this should be fine and compatible with the past changes in #212 because we're changing the internal string object. Is there anything else you would like me to review specifically before I merge? |
@jch, besides special Encodings like Japanese, Chinese and Korean, I'm pretty sure the code is OK. |
Yes, I have a monkey patch as well as your changes does :) |
Thanks all for chiming in. Merging. |
Adds correct UTF-8 encoding to Net::BER::BerIdentifiedString
Adds correct UTF-8 encoding to Net::BER::BerIdentifiedString
=== Net::LDAP 0.13.0 * Set a connect_timeout for the creation of a socket {#243}[ruby-ldap/ruby-net-ldap#243] * Update bundler before installing gems with bundler {#245}[ruby-ldap/ruby-net-ldap#245] * Net::LDAP#encryption accepts string {#239}[ruby-ldap/ruby-net-ldap#239] * Adds correct UTF-8 encoding to Net::BER::BerIdentifiedString {#242}[ruby-ldap/ruby-net-ldap#242] * Remove 2.3.0-preview since ruby-head already is included {#241}[ruby-ldap/ruby-net-ldap#241] * Drop support for ruby 1.9.3 {#240}[ruby-ldap/ruby-net-ldap#240] * Fixed capitalization of StartTLSError {#234}[ruby-ldap/ruby-net-ldap#234]
Version 0.14.0 of the gem (actually, starting with 0.13.0) contains a code change that fixes an encoding error (Encoding::UndefinedConversionError) that happens when there are extended characters in a dn. The fix forces utf-8 encoding instead of ASCII-8BIT for objects returned from the directory. See ruby-ldap/ruby-net-ldap#242 https://bugzilla.redhat.com/show_bug.cgi?id=1367600
Dear Net-LDAP maintainers
Currently, net-ldap returns values containing umlauts (e.g. 'Müller') with an encoding as 'ASCII-8BIT'.
This is wrong. LDAP in version 3 should encode all data in 'UTF-8' and therefore when
casting returned ''string'' data into a
Net::BER::BerIdentifiedString
the encoding should be 'UTF-8'.The current code tries to '#encode('UTF-8')' which results in an
Encoding::UndefinedConversionError: "\xC3" from ASCII-8BIT to UTF-8
. This error is trappedand the binary string is returned instead.
If we assume that the data coming from our LDAP/AD-Server is correctly encoded in 'UTF-8', there is
no need to use
#encode
. The only thing is to set correctly the encoding of the (string) data, i.e. to useforce_encoding
.Example:
I must admit that I have only checked the code (and added one test case) for Umlauts (and characters more or less covered by ISO-8859-1). I have no idea how to test against
Korean, Japanese, Russian, or Chinese encodings.
Nevertheless, I am pretty sure that the current code is bogus for any 'non-ASCII' characters.
Please let me know, if you need further details and I'd be happy to help you out.
regards
andi